Overview

Dataset statistics

Number of variables: 34
Number of observations: 467
Missing cells: 41
Missing cells (%): 0.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 124.2 KiB
Average record size in memory: 272.3 B
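
The two derived figures in this table can be checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (rows times columns) and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 34
n_observations = 467
missing_cells = 41
total_size_kib = 124.2

# Share of missing cells among all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average memory per record, converting KiB to bytes (1 KiB = 1024 B).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under these assumptions the script reproduces the reported 0.3% and 272.3 B.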

Variable types

CAT: 18
NUM: 16

Reproduction

Analysis started: 2020-08-19 05:58:58.264927
Analysis finished: 2020-08-19 06:00:01.885648
Duration: 1 minute and 3.62 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
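
The reported duration follows directly from the two timestamps. A minimal check using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2020-08-19 05:58:58.264927")
finished = datetime.fromisoformat("2020-08-19 06:00:01.885648")

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed_seconds = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed_seconds, 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```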

Warnings

year has constant value "2015" Constant
name has a high cardinality: 465 distinct values High cardinality
age has a high cardinality: 61 distinct values High cardinality
streetaddress has a high cardinality: 459 distinct values High cardinality
city has a high cardinality: 364 distinct values High cardinality
namelsad has a high cardinality: 389 distinct values High cardinality
lawenforcementagency has a high cardinality: 377 distinct values High cardinality
share_white has a high cardinality: 363 distinct values High cardinality
share_black has a high cardinality: 246 distinct values High cardinality
share_hispanic has a high cardinality: 293 distinct values High cardinality
p_income has a high cardinality: 452 distinct values High cardinality
pov has a high cardinality: 281 distinct values High cardinality
geo_id is highly correlated with state_fp and 1 other fields High correlation
state_fp is highly correlated with geo_id and 1 other fields High correlation
county_id is highly correlated with state_fp and 1 other fields High correlation
nat_bucket is highly correlated with h_income High correlation
h_income is highly correlated with nat_bucket High correlation
county_bucket has 27 (5.8%) missing values Missing
name is uniformly distributed Uniform
streetaddress is uniformly distributed Uniform
city is uniformly distributed Uniform
namelsad is uniformly distributed Uniform
lawenforcementagency is uniformly distributed Uniform
share_white is uniformly distributed Uniform
p_income is uniformly distributed Uniform
pov is uniformly distributed Uniform
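
The high-cardinality warnings above behave like a simple threshold on the distinct count of each categorical variable. The sketch below is an illustration, not the report's actual logic: the threshold of 50 is an assumption that this report does not state. It is chosen because it is consistent with state (47 distinct values, not flagged) and age (61 distinct values, flagged). The distinct counts are copied from the Variables section.

```python
# ASSUMPTION: the cut-off is not stated in the report; 50 is consistent with
# state (47, not flagged) and age (61, flagged).
CARDINALITY_THRESHOLD = 50

# Distinct counts of the categorical variables, copied from the report.
distinct_counts = {
    "name": 465, "age": 61, "gender": 2, "raceethnicity": 6, "month": 6,
    "year": 1, "streetaddress": 459, "city": 364, "state": 47,
    "namelsad": 389, "lawenforcementagency": 377, "cause": 5, "armed": 8,
    "share_white": 363, "share_black": 246, "share_hispanic": 293,
    "p_income": 452, "pov": 281,
}

# Variables whose distinct count exceeds the assumed threshold.
flagged = sorted(
    name for name, count in distinct_counts.items()
    if count > CARDINALITY_THRESHOLD
)

# A single distinct value corresponds to the "Constant" warning.
constant = [name for name, count in distinct_counts.items() if count == 1]

print("High cardinality:", flagged)
print("Constant:", constant)
```

With this assumed threshold, the flagged set matches the eleven high-cardinality warnings listed above, and year is the only constant variable.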

Variables

name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count465
Unique (%)99.6%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
Unknown
 
3
Karen Janks
 
1
Daryl Myler
 
1
Darin Hutchins
 
1
Carl Lao
 
1
Other values (460)
460
ValueCountFrequency (%) 
Unknown30.6%
 
Karen Janks10.2%
 
Daryl Myler10.2%
 
Darin Hutchins10.2%
 
Carl Lao10.2%
 
Don Smith10.2%
 
Jeff Alexander10.2%
 
Jeffrey Pitts10.2%
 
Wendell King10.2%
 
Cornelius Parker10.2%
 
Other values (455)45597.4%
 

Length

Max length33
Median length13
Mean length13.83511777
Min length7

age
Categorical

HIGH CARDINALITY

Distinct count61
Unique (%)13.1%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
29
 
18
34
 
18
35
 
18
36
 
17
39
 
17
Other values (56)
379
ValueCountFrequency (%) 
29183.9%
 
34183.9%
 
35183.9%
 
36173.6%
 
39173.6%
 
31173.6%
 
37163.4%
 
24163.4%
 
26163.4%
 
28153.2%
 
Other values (51)29964.0%
 

Length

Max length7
Median length2
Mean length2.042826552
Min length2

gender
Categorical

Distinct count2
Unique (%)0.4%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
Male
445
Female
 
22
ValueCountFrequency (%) 
Male44595.3%
 
Female224.7%
 

Length

Max length6
Median length4
Mean length4.094218415
Min length4

raceethnicity
Categorical

Distinct count6
Unique (%)1.3%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
White
236
Black
135
Hispanic/Latino
67
Unknown
 
15
Asian/Pacific Islander
 
10
ValueCountFrequency (%) 
White23650.5%
 
Black13528.9%
 
Hispanic/Latino6714.3%
 
Unknown153.2%
 
Asian/Pacific Islander102.1%
 
Native American40.9%
 

Length

Max length22
Median length5
Mean length6.948608137
Min length5

month
Categorical

Distinct count6
Unique (%)1.3%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
March
114
April
96
January
90
February
84
May
81
ValueCountFrequency (%) 
March11424.4%
 
April9620.6%
 
January9019.3%
 
February8418.0%
 
May8117.3%
 
June20.4%
 

Length

Max length8
Median length5
Mean length5.573875803
Min length3

day
Real number (ℝ≥0)

Distinct count31
Unique (%)6.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean15.830835117773018
Minimum1
Maximum31
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum1
5-th percentile2
Q18
median16
Q323
95-th percentile29
Maximum31
Range30
Interquartile range (IQR)15

Descriptive statistics

Standard deviation8.658969893
Coefficient of variation (CV)0.5469686108
Kurtosis-1.167610677
Mean15.83083512
Median Absolute Deviation (MAD)7
Skewness0.002419749851
Sum7393
Variance74.9777596
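
Several of the descriptive statistics for day are simple functions of the others. The sketch below recomputes them from the values reported above; it assumes all 467 observations are present (the variable has no missing values), and the variable names are illustrative.

```python
# Values copied from the statistics reported for "day".
n_observations = 467
total = 7393
std_dev = 8.658969893
q1, q3 = 8, 23
minimum, maximum = 1, 31

mean = total / n_observations   # Mean = Sum / number of observations
cv = std_dev / mean             # Coefficient of variation = std / mean
variance = std_dev ** 2         # Variance = std squared
iqr = q3 - q1                   # Interquartile range = Q3 - Q1
value_range = maximum - minimum # Range = Maximum - Minimum

print(mean, cv, variance, iqr, value_range)
```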
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
21234.9%
 
15224.7%
 
17224.7%
 
8214.5%
 
27214.5%
 
6194.1%
 
26173.6%
 
4173.6%
 
19163.4%
 
16163.4%
 
Other values (21)27358.5%
 
ValueCountFrequency (%) 
1132.8%
 
2132.8%
 
3153.2%
 
4173.6%
 
5102.1%
 
ValueCountFrequency (%) 
3191.9%
 
30143.0%
 
29102.1%
 
28153.2%
 
27214.5%
 

year
Categorical

CONSTANT
REJECTED

Distinct count1
Unique (%)0.2%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
2015
467
ValueCountFrequency (%) 
2015467100.0%
 

Length

Max length4
Median length4
Mean length4
Min length4

streetaddress
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count459
Unique (%)99.1%
Missing4
Missing (%)0.9%
Memory size3.6 KiB
I-10
 
2
4999 Naaman Forest Blvd
 
2
I-95
 
2
E Baseline Rd and S 48th St
 
2
2853 Avalon Meadows Ct
 
1
Other values (454)
454
ValueCountFrequency (%) 
I-1020.4%
 
4999 Naaman Forest Blvd20.4%
 
I-9520.4%
 
E Baseline Rd and S 48th St20.4%
 
2853 Avalon Meadows Ct10.2%
 
Hectorville Rd and Bixby Rd10.2%
 
3800 Vernon Ave10.2%
 
2800 Longmeadow Dr10.2%
 
Fern Ave and Phillips St10.2%
 
331 W Grand Ledge Hwy10.2%
 
Other values (449)44996.1%
 
(Missing)40.9%
 

Length

Max length65
Median length16
Mean length17.93361884
Min length3

city
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count364
Unique (%)77.9%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
Los Angeles
 
9
Phoenix
 
6
Houston
 
6
Oklahoma City
 
4
San Francisco
 
4
Other values (359)
438
ValueCountFrequency (%) 
Los Angeles91.9%
 
Phoenix61.3%
 
Houston61.3%
 
Oklahoma City40.9%
 
San Francisco40.9%
 
New York40.9%
 
Indianapolis40.9%
 
Tulsa40.9%
 
Harvey30.6%
 
Fort Worth30.6%
 
Other values (354)42089.9%
 

Length

Max length22
Median length8
Mean length8.640256959
Min length4

state
Categorical

Distinct count47
Unique (%)10.1%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
CA
74
TX
 
46
FL
 
29
AZ
 
25
OK
 
22
Other values (42)
271
ValueCountFrequency (%) 
CA7415.8%
 
TX469.9%
 
FL296.2%
 
AZ255.4%
 
OK224.7%
 
GA163.4%
 
NY143.0%
 
CO122.6%
 
IL112.4%
 
NJ112.4%
 
Other values (37)20744.3%
 

Length

Max length2
Median length2
Mean length2
Min length2

latitude
Real number (ℝ≥0)

Distinct count462
Unique (%)98.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean36.40322413147751
Minimum19.915194
Maximum61.218408
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum19.915194
5-th percentile28.66026672
Q133.3352402
median35.7697792
Q339.9374522
95-th percentile44.94167621
Maximum61.218408
Range41.303214
Interquartile range (IQR)6.602212

Descriptive statistics

Standard deviation5.193356785
Coefficient of variation (CV)0.1426620007
Kurtosis1.675106283
Mean36.40322413
Median Absolute Deviation (MAD)3.3626404
Skewness0.3541898227
Sum17000.30567
Variance26.9709547
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
32.407138830.6%
 
40.8630620.4%
 
33.378193820.4%
 
32.959940620.4%
 
38.946450110.2%
 
61.207557710.2%
 
35.37617510.2%
 
37.563148510.2%
 
35.164833710.2%
 
34.628594410.2%
 
Other values (452)45296.8%
 
ValueCountFrequency (%) 
19.91519410.2%
 
21.306512910.2%
 
21.310643510.2%
 
21.933860810.2%
 
25.155902910.2%
 
ValueCountFrequency (%) 
61.21840810.2%
 
61.207557710.2%
 
49.000011410.2%
 
48.708541910.2%
 
47.837775110.2%
 

longitude
Real number (ℝ)

Distinct count462
Unique (%)98.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-96.97266602248393
Minimum-159.64270019999998
Maximum-68.1000068
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum-159.6427002
5-th percentile-122.285112
Q1-111.9546359
median-94.7619019
Q3-82.96158185
95-th percentile-74.26755766
Maximum-68.1000068
Range91.5426934
Interquartile range (IQR)28.99305405

Descriptive statistics

Standard deviation16.95384164
Coefficient of variation (CV)-0.1748311389
Kurtosis0.1159338992
Mean-96.97266602
Median Absolute Deviation (MAD)13.3324662
Skewness-0.6587733046
Sum-45286.23503
Variance287.4327465
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-96.073890730.6%
 
-96.638956120.4%
 
-111.97845220.4%
 
-74.011291520.4%
 
-95.15651710.2%
 
-77.720633210.2%
 
-92.293471210.2%
 
-82.638099710.2%
 
-111.985183710.2%
 
-106.439617810.2%
 
Other values (452)45296.8%
 
ValueCountFrequency (%) 
-159.642700210.2%
 
-157.862598410.2%
 
-157.859865910.2%
 
-155.83175410.2%
 
-149.858200110.2%
 
ValueCountFrequency (%) 
-68.100006810.2%
 
-69.96642310.2%
 
-70.931617710.2%
 
-71.089922910.2%
 
-71.283142110.2%
 

state_fp
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count47
Unique (%)10.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean25.342612419700213
Minimum1
Maximum56
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum1
5-th percentile4
Q18
median24
Q340
95-th percentile51
Maximum56
Range55
Interquartile range (IQR)32

Descriptive statistics

Standard deviation16.76645832
Coefficient of variation (CV)0.6615915534
Kurtosis-1.415458598
Mean25.34261242
Median Absolute Deviation (MAD)16
Skewness0.1777118785
Sum11835
Variance281.1141245
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
67415.8%
 
484710.1%
 
12296.2%
 
4255.4%
 
40224.7%
 
13163.4%
 
36132.8%
 
8122.6%
 
34112.4%
 
53112.4%
 
Other values (37)20744.3%
 
ValueCountFrequency (%) 
181.7%
 
220.4%
 
4255.4%
 
540.9%
 
67415.8%
 
ValueCountFrequency (%) 
5610.2%
 
5551.1%
 
5420.4%
 
53112.4%
 
5191.9%
 

county_fp
Real number (ℝ≥0)

Distinct count113
Unique (%)24.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean91.5845824411135
Minimum1
Maximum740
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum1
5-th percentile3
Q129
median63
Q3111
95-th percentile362.2
Maximum740
Range739
Interquartile range (IQR)82

Descriptive statistics

Standard deviation110.1851291
Coefficient of variation (CV)1.203096921
Kurtosis9.16060243
Mean91.58458244
Median Absolute Deviation (MAD)40
Skewness2.79103332
Sum42770
Variance12140.76268
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
37234.9%
 
13173.6%
 
3173.6%
 
1163.4%
 
71163.4%
 
29143.0%
 
73112.4%
 
591.9%
 
1191.9%
 
3191.9%
 
Other values (103)32669.8%
 
ValueCountFrequency (%) 
1163.4%
 
3173.6%
 
591.9%
 
720.4%
 
951.1%
 
ValueCountFrequency (%) 
74020.4%
 
55010.2%
 
51040.9%
 
48510.2%
 
47910.2%
 

tract_ce
Real number (ℝ≥0)

Distinct count389
Unique (%)83.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean236936.61456102785
Minimum100
Maximum980000
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum100
5-th percentile403.3
Q15201.5
median40200
Q3378450
95-th percentile960200
Maximum980000
Range979900
Interquartile range (IQR)373248.5

Descriptive statistics

Standard deviation341262.7217
Coefficient of variation (CV)1.440312306
Kurtosis0.07448872917
Mean236936.6146
Median Absolute Deviation (MAD)39399
Skewness1.29304822
Sum110649399
Variance1.164602452e+11
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
20051.1%
 
200040.9%
 
120040.9%
 
96060040.9%
 
70040.9%
 
40040.9%
 
95030040.9%
 
96020030.6%
 
1060030.6%
 
97030030.6%
 
Other values (379)42991.9%
 
ValueCountFrequency (%) 
10030.6%
 
10120.4%
 
11010.2%
 
20051.1%
 
20210.2%
 
ValueCountFrequency (%) 
98000020.4%
 
97950010.2%
 
97840010.2%
 
97050110.2%
 
97050020.4%
 

geo_id
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count458
Unique (%)98.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean25434433938.75589
Minimum1003010300
Maximum56005000700
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum1003010300
5-th percentile4013452482
Q18022008307
median2.403380352e+10
Q34.01124703e+10
95-th percentile5.107803147e+10
Maximum5.60050007e+10
Range5.50019904e+10
Interquartile range (IQR)3.2090462e+10

Descriptive statistics

Standard deviation1.680139797e+10
Coefficient of variation (CV)0.6605768389
Kurtosis-1.413977994
Mean2.543443394e+10
Median Absolute Deviation (MAD)1.60753037e+10
Skewness0.1785203481
Sum1.187788065e+13
Variance2.822869738e+20
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
4.8467951e+1030.6%
 
4.01430025e+1020.4%
 
606504330820.4%
 
402500200220.4%
 
401311620520.4%
 
5.30210202e+1020.4%
 
3.40030546e+1020.4%
 
4.811301903e+1020.4%
 
5.30730102e+1010.2%
 
105103090210.2%
 
Other values (448)44895.9%
 
ValueCountFrequency (%) 
100301030010.2%
 
105103090210.2%
 
107300200010.2%
 
107300500010.2%
 
107301441210.2%
 
ValueCountFrequency (%) 
5.60050007e+1010.2%
 
5.50790209e+1010.2%
 
5.50590012e+1010.2%
 
5.50390403e+1010.2%
 
5.50250019e+1010.2%
 

county_id
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count300
Unique (%)64.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean25434.19700214133
Minimum1003
Maximum56005
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum1003
5-th percentile4013
Q18022
median24033
Q340112
95-th percentile51077.4
Maximum56005
Range55002
Interquartile range (IQR)32090

Descriptive statistics

Standard deviation16801.37976
Coefficient of variation (CV)0.6605822764
Kurtosis-1.413976483
Mean25434.197
Median Absolute Deviation (MAD)16076
Skewness0.178523607
Sum11877770
Variance282286361.7
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
6037204.3%
 
4013122.6%
 
602981.7%
 
4820171.5%
 
607171.5%
 
605961.3%
 
4843961.3%
 
4010951.1%
 
3400351.1%
 
608540.9%
 
Other values (290)38782.9%
 
ValueCountFrequency (%) 
100310.2%
 
105110.2%
 
107330.6%
 
107910.2%
 
108910.2%
 
ValueCountFrequency (%) 
5600510.2%
 
5507910.2%
 
5505910.2%
 
5503910.2%
 
5502510.2%
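
The high correlation between geo_id, county_id and state_fp is what one would expect if the identifiers are nested. The sketch below assumes geo_id follows the usual Census tract identifier layout (state code, then a 3-digit county code, then a 6-digit tract code, stored as an integer so leading zeros are dropped). The reported minima and maxima of geo_id, county_id and state_fp are consistent with that layout, but the report itself does not state it. The function name is hypothetical.

```python
def split_geo_id(geo_id):
    """Split an integer tract identifier into its assumed components.

    ASSUMPTION: the last 6 digits are the tract code and the 3 digits
    before them are the county code; the rest is the state code.
    """
    county_id, tract_ce = divmod(geo_id, 10**6)
    state_fp, county_fp = divmod(county_id, 10**3)
    return state_fp, county_fp, tract_ce, county_id


# Minimum and maximum geo_id values copied from the report.
print(split_geo_id(1003010300))
print(split_geo_id(56005000700))
```

Under this assumption the smallest and largest geo_id values yield county_id 1003 and 56005 and state_fp 1 and 56, matching the minima and maxima reported for those variables.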
 

namelsad
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count389
Unique (%)83.3%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
Census Tract 2
 
5
Census Tract 20
 
4
Census Tract 9606
 
4
Census Tract 7
 
4
Census Tract 12
 
4
Other values (384)
446
ValueCountFrequency (%) 
Census Tract 251.1%
 
Census Tract 2040.9%
 
Census Tract 960640.9%
 
Census Tract 740.9%
 
Census Tract 1240.9%
 
Census Tract 440.9%
 
Census Tract 950340.9%
 
Census Tract 960230.6%
 
Census Tract 970330.6%
 
Census Tract 1130.6%
 
Other values (379)42991.9%
 

Length

Max length20
Median length17
Mean length17.44325482
Min length14

lawenforcementagency
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count377
Unique (%)80.7%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
Los Angeles Police Department
 
10
Los Angeles County Sheriff's Department
 
5
Oklahoma City Police Department
 
5
US Marshals Service
 
5
Indianapolis Metropolitan Police Department
 
4
Other values (372)
438
ValueCountFrequency (%) 
Los Angeles Police Department102.1%
 
Los Angeles County Sheriff's Department51.1%
 
Oklahoma City Police Department51.1%
 
US Marshals Service51.1%
 
Indianapolis Metropolitan Police Department40.9%
 
New York Police Department40.9%
 
Fort Worth Police Department40.9%
 
Phoenix Police Department40.9%
 
Jefferson Parish Sheriff's Office30.6%
 
Riverside County Sheriff's Department30.6%
 
Other values (367)42089.9%
 

Length

Max length95
Median length28
Mean length31.38543897
Min length3

cause
Categorical

Distinct count5
Unique (%)1.1%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
Gunshot
411
Taser
 
27
Death in custody
 
14
Struck by vehicle
 
12
Unknown
 
3
ValueCountFrequency (%) 
Gunshot41188.0%
 
Taser275.8%
 
Death in custody143.0%
 
Struck by vehicle122.6%
 
Unknown30.6%
 

Length

Max length17
Median length7
Mean length7.411134904
Min length5

armed
Categorical

Distinct count8
Unique (%)1.7%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
Firearm
230
No
102
Knife
68
Other
 
26
Vehicle
 
18
Other values (3)
 
23
ValueCountFrequency (%) 
Firearm23049.3%
 
No10221.8%
 
Knife6814.6%
 
Other265.6%
 
Vehicle183.9%
 
Non-lethal firearm143.0%
 
Unknown71.5%
 
Disputed20.4%
 

Length

Max length18
Median length7
Mean length5.839400428
Min length2

pop
Real number (ℝ≥0)

Distinct count445
Unique (%)95.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4783.719486081371
Minimum0
Maximum26826
Zeros2
Zeros (%)0.4%
Memory size3.6 KiB

Quantile statistics

Minimum0
5-th percentile1915.2
Q13357.5
median4447
Q35815.5
95-th percentile8502.7
Maximum26826
Range26826
Interquartile range (IQR)2458

Descriptive statistics

Standard deviation2374.565749
Coefficient of variation (CV)0.4963848226
Kurtosis18.1225437
Mean4783.719486
Median Absolute Deviation (MAD)1209
Skewness2.697154221
Sum2233997
Variance5638562.494
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
685230.6%
 
020.4%
 
457920.4%
 
215020.4%
 
288620.4%
 
372120.4%
 
428420.4%
 
429320.4%
 
881020.4%
 
154820.4%
 
Other values (435)44695.5%
 
ValueCountFrequency (%) 
020.4%
 
40310.2%
 
67810.2%
 
73210.2%
 
127110.2%
 
ValueCountFrequency (%) 
2682610.2%
 
1816810.2%
 
1398710.2%
 
1356110.2%
 
1235110.2%
 

share_white
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count363
Unique (%)77.7%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
87.2
 
5
52.2
 
4
58.6
 
4
39.3
 
3
75.3
 
3
Other values (358)
448
ValueCountFrequency (%) 
87.251.1%
 
52.240.9%
 
58.640.9%
 
39.330.6%
 
75.330.6%
 
67.930.6%
 
5.730.6%
 
5.130.6%
 
9030.6%
 
8.530.6%
 
Other values (353)43392.7%
 

Length

Max length4
Median length4
Mean length3.644539615
Min length1

share_black
Categorical

HIGH CARDINALITY

Distinct count246
Unique (%)52.7%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
0
 
27
0.9
 
13
0.3
 
12
0.2
 
9
0.6
 
9
Other values (241)
397
ValueCountFrequency (%) 
0275.8%
 
0.9132.8%
 
0.3122.6%
 
0.291.9%
 
0.691.9%
 
0.181.7%
 
1.881.7%
 
0.471.5%
 
1.361.3%
 
1.461.3%
 
Other values (236)36277.5%
 

Length

Max length4
Median length3
Mean length3.100642398
Min length1

share_hispanic
Categorical

HIGH CARDINALITY

Distinct count293
Unique (%)62.7%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
0
 
18
0.3
 
6
1.5
 
6
2.9
 
6
10.4
 
6
Other values (288)
425
ValueCountFrequency (%) 
0183.9%
 
0.361.3%
 
1.561.3%
 
2.961.3%
 
10.461.3%
 
3.161.3%
 
1.151.1%
 
0.740.9%
 
640.9%
 
2.240.9%
 
Other values (283)40286.1%
 

Length

Max length4
Median length3
Mean length3.293361884
Min length1

p_income
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count452
Unique (%)96.8%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
22969
 
3
15373
 
2
16415
 
2
23243
 
2
14256
 
2
Other values (447)
456
ValueCountFrequency (%) 
2296930.6%
 
1537320.4%
 
1641520.4%
 
2324320.4%
 
1425620.4%
 
1655820.4%
 
2812520.4%
 
1763220.4%
 
2113020.4%
 
3747620.4%
 
Other values (442)44695.5%
 

Length

Max length5
Median length5
Mean length4.974304069
Min length1

h_income
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count454
Unique (%)97.6%
Missing2
Missing (%)0.4%
Infinite0
Infinite (%)0.0%
Mean46627.18279569892
Minimum10290.0
Maximum142500.0
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum10290
5-th percentile20577.8
Q132625
median42759
Q356190
95-th percentile83415.4
Maximum142500
Range132210
Interquartile range (IQR)23565

Descriptive statistics

Standard deviation20511.19491
Coefficient of variation (CV)0.4398977952
Kurtosis2.938621319
Mean46627.1828
Median Absolute Deviation (MAD)11502
Skewness1.35029416
Sum21681640
Variance420709116.5
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
5028030.6%
 
9309120.4%
 
2853720.4%
 
5151820.4%
 
3681820.4%
 
4997320.4%
 
3443220.4%
 
4600720.4%
 
3552420.4%
 
3270820.4%
 
Other values (444)44495.1%
 
(Missing)20.4%
 
ValueCountFrequency (%) 
1029010.2%
 
1093110.2%
 
1137810.2%
 
1521210.2%
 
1529310.2%
 
ValueCountFrequency (%) 
14250010.2%
 
13875010.2%
 
13562510.2%
 
11819410.2%
 
11741310.2%
 

county_income
Real number (ℝ≥0)

Distinct count299
Unique (%)64.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean52527.33190578158
Minimum22545
Maximum110292
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum22545
5-th percentile34582
Q143804
median50856
Q356832
95-th percentile77464.5
Maximum110292
Range87747
Interquartile range (IQR)13028

Descriptive statistics

Standard deviation12948.26381
Coefficient of variation (CV)0.2465052638
Kurtosis1.88319875
Mean52527.33191
Median Absolute Deviation (MAD)6145
Skewness1.078794502
Sum24530264
Variance167657535.7
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
55909204.3%
 
53596122.6%
 
4855281.7%
 
5313771.5%
 
5409071.5%
 
5685361.3%
 
7542261.3%
 
4521551.1%
 
8379451.1%
 
4233440.9%
 
Other values (289)38782.9%
 
ValueCountFrequency (%) 
2254510.2%
 
2492710.2%
 
2549810.2%
 
2687710.2%
 
2976910.2%
 
ValueCountFrequency (%) 
11029210.2%
 
10320810.2%
 
9822110.2%
 
9170240.9%
 
8820210.2%
 

comp_income
Real number (ℝ≥0)

Distinct count456
Unique (%)98.1%
Missing2
Missing (%)0.4%
Infinite0
Infinite (%)0.0%
Mean0.895912580064516
Minimum0.18404908
Maximum2.8652160139999996
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum0.18404908
5-th percentile0.4334039334
Q10.645365352
median0.869612089
Q31.081454256
95-th percentile1.507397811
Maximum2.865216014
Range2.681166934
Interquartile range (IQR)0.436088904

Descriptive statistics

Standard deviation0.3335837842
Coefficient of variation (CV)0.3723396585
Kurtosis2.721721337
Mean0.8959125801
Median Absolute Deviation (MAD)0.216736985
Skewness0.9383739963
Sum416.5993497
Variance0.111278141
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
1.15748520930.6%
 
0.57860567120.4%
 
1.00994321120.4%
 
0.51719013420.4%
 
1.11095066520.4%
 
0.85649149720.4%
 
0.7146385520.4%
 
0.66281065820.4%
 
1.4761800710.2%
 
0.79080260610.2%
 
Other values (446)44695.5%
 
(Missing)20.4%
 
ValueCountFrequency (%) 
0.1840490810.2%
 
0.22479281110.2%
 
0.26208739310.2%
 
0.28497564610.2%
 
0.28784908910.2%
 
ValueCountFrequency (%) 
2.86521601410.2%
 
2.30870286410.2%
 
1.82113792110.2%
 
1.81096320910.2%
 
1.81059926510.2%
 

county_bucket
Real number (ℝ≥0)

MISSING

Distinct count5
Unique (%)1.1%
Missing27
Missing (%)5.8%
Infinite0
Infinite (%)0.0%
Mean2.4977272727272726
Minimum1.0
Maximum5.0
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median2
Q34
95-th percentile5
Maximum5
Range4
Interquartile range (IQR)3

Descriptive statistics

Standard deviation1.393114967
Coefficient of variation (CV)0.5577530349
Kurtosis-1.039237829
Mean2.497727273
Median Absolute Deviation (MAD)1
Skewness0.4947451927
Sum1099
Variance1.94076931
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
114531.0%
 
210121.6%
 
38017.1%
 
45812.4%
 
55612.0%
 
(Missing)275.8%
 
ValueCountFrequency (%) 
114531.0%
 
210121.6%
 
38017.1%
 
45812.4%
 
55612.0%
 
ValueCountFrequency (%) 
55612.0%
 
45812.4%
 
38017.1%
 
210121.6%
 
114531.0%
 

nat_bucket
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count5
Unique (%)1.1%
Missing2
Missing (%)0.4%
Infinite0
Infinite (%)0.0%
Mean2.496774193548387
Minimum1.0
Maximum5.0
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median2
Q33
95-th percentile5
Maximum5
Range4
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.298411796
Coefficient of variation (CV)0.5200357323
Kurtosis-0.9952138173
Mean2.496774194
Median Absolute Deviation (MAD)1
Skewness0.401653297
Sum1161
Variance1.685873192
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
113929.8%
 
211123.8%
 
39921.2%
 
47716.5%
 
5398.4%
 
(Missing)20.4%
 
ValueCountFrequency (%) 
113929.8%
 
211123.8%
 
39921.2%
 
47716.5%
 
5398.4%
 
ValueCountFrequency (%) 
5398.4%
 
47716.5%
 
39921.2%
 
211123.8%
 
113929.8%
 

pov
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count281
Unique (%)60.2%
Missing0
Missing (%)0.0%
Memory size3.6 KiB
14.1
 
6
20.5
 
5
7.3
 
5
7.7
 
5
18.4
 
5
Other values (276)
441
ValueCountFrequency (%) 
14.161.3%
 
20.551.1%
 
7.351.1%
 
7.751.1%
 
18.451.1%
 
11.251.1%
 
9.551.1%
 
20.151.1%
 
17.440.9%
 
11.340.9%
 
Other values (271)41889.5%
 

Length

Max length4
Median length4
Mean length3.584582441
Min length1

urate
Real number (ℝ≥0)

Distinct count456
Unique (%)98.1%
Missing2
Missing (%)0.4%
Infinite0
Infinite (%)0.0%
Mean0.11739939258064516
Minimum0.011335013
Maximum0.507614213
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum0.011335013
5-th percentile0.0352971268
Q10.068592058
median0.105180534
Q30.140832976
95-th percentile0.2557933014
Maximum0.507614213
Range0.4962792
Interquartile range (IQR)0.072240918

Descriptive statistics

Standard deviation0.06917513274
Coefficient of variation (CV)0.5892290515
Kurtosis4.789264266
Mean0.1173993926
Median Absolute Deviation (MAD)0.036406269
Skewness1.694841819
Sum54.59071755
Variance0.004785198989
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
0.05857329830.6%
 
0.14921946720.4%
 
0.09214891320.4%
 
0.21619433220.4%
 
0.21684867420.4%
 
0.13174311920.4%
 
0.08068669520.4%
 
0.06643499620.4%
 
0.21418338110.2%
 
0.08822585810.2%
 
Other values (446)44695.5%
 
(Missing)20.4%
 
ValueCountFrequency (%) 
0.01133501310.2%
 
0.01191611110.2%
 
0.01519756810.2%
 
0.01620370410.2%
 
0.01819322510.2%
 
ValueCountFrequency (%) 
0.50761421310.2%
 
0.44807467910.2%
 
0.43959044410.2%
 
0.42232142910.2%
 
0.34877734910.2%
 

college
Real number (ℝ≥0)

Distinct count455
Unique (%)97.8%
Missing2
Missing (%)0.4%
Infinite0
Infinite (%)0.0%
Mean0.22021668519569892
Minimum0.013547237
Maximum0.828070175
Zeros0
Zeros (%)0.0%
Memory size3.6 KiB

Quantile statistics

Minimum0.013547237
5-th percentile0.0496061808
Q10.106167401
median0.169544297
Q30.284542314
95-th percentile0.5382201012
Maximum0.828070175
Range0.814522938
Interquartile range (IQR)0.178374913

Descriptive statistics

Standard deviation0.1583472315
Coefficient of variation (CV)0.7190519256
Kurtosis1.746945788
Mean0.2202166852
Median Absolute Deviation (MAD)0.073944904
Skewness1.391802537
Sum102.4007586
Variance0.02507384573
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
0.10024906630.6%
 
0.13952077620.4%
 
0.40877437320.4%
 
0.48717094320.4%
 
0.0859060420.4%
 
0.14285714320.4%
 
0.31563890720.4%
 
0.05725658320.4%
 
0.24432029820.4%
 
0.07275132310.2%
 
Other values (445)44595.3%
 
(Missing)20.4%
 
ValueCountFrequency (%) 
0.01354723710.2%
 
0.01411764710.2%
 
0.01588594710.2%
 
0.02266839410.2%
 
0.02640374310.2%
 
ValueCountFrequency (%) 
0.82807017510.2%
 
0.82497116510.2%
 
0.79406986210.2%
 
0.77906336110.2%
 
0.75511038410.2%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
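
The calculation described above can be sketched in plain Python. This is an illustration of the formula, not the code the report itself ran; the function name and sample data are made up for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

r_linear = pearson_r(xs, ys)
# Shifting and rescaling x (with a positive factor) leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in xs], ys)
# Reversing the direction of y flips the sign.
r_negative = pearson_r(xs, [-y for y in ys])

print(r_linear, r_rescaled, r_negative)
```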

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
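
A minimal sketch of this rank-based calculation follows. It assigns average ranks to ties, which is a common convention but an assumption here, since the text does not say how ties are handled. The helper names and sample data are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but nonlinear

rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

For the cubic relationship in the example, ρ is 1 because the ordering is perfectly preserved, while Pearson's r is below 1 because the relationship is not linear.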

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
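
The pair-counting definition above can be sketched as follows. Note that this is the simple form of the coefficient exactly as described in the text; library implementations commonly use a tie-adjusted variant, so results may differ when the data contain ties. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables move in the same direction
            concordant += 1
        elif direction < 0:    # the variables move in opposite directions
            discordant += 1
        # direction == 0 means a tie in x or y; it counts as neither.
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]  # one swapped pair out of six

tau = kendall_tau(xs, ys)
print(tau)
```

In the example, five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6.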

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
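
A sketch of a bias-corrected Cramér's V is given below. It uses the commonly cited form of Bergsma's correction (shrinking φ² and the table dimensions by terms of order 1/(n-1)); whether the report's own implementation matches it term for term is an assumption. The function name and data are illustrative.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row_totals), len(col_totals)

    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)

    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Perfect association: each x label maps to exactly one y label.
v_perfect = cramers_v_corrected(["a"] * 10 + ["b"] * 10,
                                ["u"] * 10 + ["v"] * 10)
# Independence: every combination occurs equally often.
v_independent = cramers_v_corrected(["a", "a", "b", "b"] * 5,
                                    ["u", "v", "u", "v"] * 5)
print(v_perfect, v_independent)
```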

Missing values

Sample

First rows

name age gender raceethnicity month day year streetaddress city state latitude longitude state_fp county_fp tract_ce geo_id county_id namelsad lawenforcementagency cause armed pop share_white share_black share_hispanic p_income h_income county_income comp_income county_bucket nat_bucket pov urate college
0A'donte Washington16MaleBlackFebruary232015Clearview LnMillbrookAL32.529577-86.3628291513090210510309021051Census Tract 309.02Millbrook Police DepartmentGunshotNo377960.530.55.62837551367.0547660.9379363.03.014.10.0976860.168510
1Aaron Rutledge27MaleWhiteApril22015300 block Iris Park DrPinevilleLA31.321739-92.4348602279117002207901170022079Census Tract 117Rapides Parish Sheriff's OfficeGunshotNo276953.836.20.51467827972.0409300.6834112.01.028.80.0657240.111402
2Aaron Siler26MaleWhiteMarch14201522nd Ave and 56th StKenoshaWI42.583560-87.835710555912005505900120055059Census Tract 12Kenosha Police DepartmentGunshotNo407973.87.716.82528645365.0549300.8258692.03.014.60.1662930.147312
3Aaron Valdez25MaleHispanic/LatinoMarch1120153000 Seminole AveSouth GateCA33.939298-118.21946363753560760375356076037Census Tract 5356.07South Gate Police DepartmentGunshotFirearm43431.20.698.81719448295.0559090.8638143.03.011.70.1248270.050133
4Adam Jovicic29MaleWhiteMarch192015364 Hiwood AveMunroe FallsOH41.148575-81.429878391535308003915353080039153Census Tract 5308Kent Police DepartmentGunshotNo680992.51.41.73395468785.0496691.3848685.04.01.90.0635500.403954
5Adam Reinhart29MaleWhiteMarch7201518th St and Palm LnPhoenixAZ33.469380-112.04332041311160240131116024013Census Tract 1116.02Phoenix Police DepartmentGunshotNo468277.7791552320833.0535960.3887041.01.0580.0736510.102955
6Adrian Hernandez22MaleHispanic/LatinoMarch2720154000 Union AveBakersfieldCA35.395697-119.00274562970060290007006029Census Tract 7Bakersfield Police DepartmentGunshotFirearm502750.80.344.22594958068.0485521.1959964.04.017.20.1314610.203801
7Adrian Solis35MaleHispanic/LatinoMarch2620151500 Bayview AveWilmingtonCA33.793050-118.27092663729420060372942006037Census Tract 2942Los Angeles Police DepartmentGunshotNon-lethal firearm52388.60.284.12504366543.0559091.1902024.04.012.20.0943470.090438
8Alan Alverson44MaleWhiteJanuary282015Pickett Runn RdSunsetTX30.665304-96.40148248416034804100060348041Census Tract 6.03Wise County Sheriff's Department and Texas DPSGunshotFirearm483214.617.766.31677830391.0383100.7932922.01.037.70.1408330.047601
9Alan James31MaleWhiteFebruary72015200 Abbie St SEWyomingMI42.893238-85.6605842681142002608101420026081Census Tract 142Kentwood Police Department and Wyoming DPSGunshotOther379563.67.726.52200544553.0516670.8623113.02.018.40.1741670.102692

Last rows

name age gender raceethnicity month day year streetaddress city state latitude longitude state_fp county_fp tract_ce geo_id county_id namelsad lawenforcementagency cause armed pop share_white share_black share_hispanic p_income h_income county_income comp_income county_bucket nat_bucket pov urate college
457Walter Scott50MaleBlackApril420151945 Remount RdNorth CharlestonSC32.899113-80.013802451933004501900330045019Census Tract 33North Charleston Police DepartmentGunshotKnife411015.865.218.61496319988.0507920.3935271.01.044.10.2159090.092950
458Wendell King40MaleWhiteJanuary2920154800 Hildring Dr EForth WorthTX32.678600-97.380737484391054044843910540448439Census Tract 1054.04Fort Worth Police DepartmentGunshotUnknown407489.20.9654494102938.0568531.8105995.05.011.20.0275750.711270
459Wilber Castillo-Gongora35MaleHispanic/LatinoFebruary52015US-287ElectraTX34.044584-98.93119848485138004848501380048485Census Tract 138Wichita County Sheriff's OfficeTaserUnknown350794.802.53176162817.0450861.3932715.04.040.0345200.169804
460William 'Rusty' Smith53MaleWhiteMarch102015700 Valley StHooverAL33.414931-86.8515571731441210730144121073Census Tract 144.12Hoover Police DepartmentGunshotNo454784.71.47.74231281996.0454291.8049265.05.030.0359160.627632
461William Campbell59MaleUnknownJanuary252015335 New Brooklyn RdBerlinNJ39.749119-74.9293063476091033400760910334007Census Tract 6091.03Winslow Police DepartmentGunshotNon-lethal firearm506564.316.94.93147065759.0616831.0660803.04.07.30.1408660.236107
462William Chapman II18MaleBlackApril2220151098 Frederick BlvdPortsmouthVA36.829014-76.341438517402115005174021150051740Census Tract 2115Portsmouth Police DepartmentGunshotNo164040.953.802526227418.0461660.5939001.01.035.20.1520470.120553
463William Dick III28MaleNative AmericanApril42015Bureau of Indian Affairs Rd 66TonasketWA48.708542-119.43682953479704005304797040053047Census Tract 9704US Forest ServiceTaserFirearm415474.50.420.21847035608.0403680.8820851.02.027.30.1336500.174525
464William Poole52MaleWhiteMarch162015130 Wedowee LnGastonNC35.205776-81.2406693771317043707103170437071Census Tract 317.04Gaston County Police DepartmentGunshotFirearm385083.210.10.32117538200.0420170.9091562.02.028.50.2561500.072764
465Yuvette Henderson38FemaleBlackFebruary320153800 Hollis StOaklandCA37.827129-122.2844926140170060014017006001Census Tract 4017Emeryville Police DepartmentGunshotFirearm254421.724.937.12697163052.0721120.8743622.04.023.90.0696010.396476
466Zaki Shinwary48MaleUnknownJanuary162015Lake Arrowhead Ave and Great Salt Lake DrFremontCA37.586471-122.0600106144152260014415226001Census Tract 4415.22Fremont Police DepartmentGunshotFirearm517723.1412.33359088940.0721121.2333594.05.06.10.0809120.435773